Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial Swift Interop design doc #302

Merged
merged 13 commits into from
Oct 26, 2023
Merged

Conversation

jkoritzinsky
Copy link
Member

Add a design doc for Swift/.NET interop, primarily focused on the runtime layer initially.

cc:
@dotnet/interop-contrib
@dotnet/macios
@lambdageek
@SamMonoRT
@kotlarmilos
@JulieLeeMSFT
@amanasifkhalid
@davidortinau
@jkotas

proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Show resolved Hide resolved
proposed/swift-interop.md Show resolved Hide resolved
Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start!

proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Show resolved Hide resolved
1. Use an attribute like `[SwiftSelf]` on a parameter to specify it should go in the self register.
2. Specify that combining the `Swift` calling convention with the `MemberFunction` calling convention means that the first argument is the `self` argument.

The first option seems like a natural fit, but there is one significant limitation: Attributes cannot be used in function pointers. Function pointers are a vital scenario for us to support as we want to support virtual method tables as they are used in scenarios like protocol witness tables. The second option is a natural fit as the `MemberFunction` calling convention combined with the various C-based calling conventions specifies that there is a `this` argument as the first argument. Defining `Swift` + `MemberFunction` to imply/require the `self` argument is a perfect conceptual extension of the existing model.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a this argument as the first argument.

Is this always the case in Swift? I would like to ensure we don't make odd constructs just because. @stephen-hawley Do you know?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this moot pretty easily by noting that the stable witness table layouts are fragile, and undefined. We should never make use of them directly. However, with enable-library-evolution all virtual calls and protocol calls are available via thunks that use the self pointer. In terms "is the self pointer always first" that depends. When BTfS has to wrap a swift function to make it callable, the general approach is to:

  1. insert a pointer to the instance before the first argument if there is an instance pointer
  2. insert a pointer to memory allocated for the return value before the first argument (which may already be self)

So, no, it is not always the case. However, I have not looked into detail how the swift ABI handles larger value types as return values in enable-swift-evolution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think relying on MemberFunction might be a bit too much magic then. I think being more declarative via something akin to SwiftThis<T> might be more appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this moot pretty easily by noting that the stable witness table layouts are fragile, and undefined. We should never make use of them directly. However, with enable-library-evolution all virtual calls and protocol calls are available via thunks that use the self pointer. In terms "is the self pointer always first" that depends. When BTfS has to wrap a swift function to make it callable, the general approach is to:

  1. insert a pointer to the instance before the first argument if there is an instance pointer
  2. insert a pointer to memory allocated for the return value before the first argument (which may already be self)

So, no, it is not always the case. However, I have not looked into detail how the swift ABI handles larger value types as return values in enable-swift-evolution.

Re item 1: Is that argument that was introduced a replacement for the self argument as the functions emitted by BTfS were static and did not have a self argument?

Re item 2: For large return types that are passed by value, that is an implementation detail that our JIT/AOT compilers should handle automatically (the primary use case of MemberFunction initially was to help ensure this determination was correct in some MSVC corner cases). The user should never need to declare the "return buffer" argument manually.

proposed/swift-interop.md Outdated Show resolved Hide resolved
Comment on lines 115 to 117
For implementing existing Swift types in .NET, we will require the following tooling:

1. A Roslyn source generator to generate any supporting code needed to produce any required metadata to pass instances of Swift-type-implementing .NET types to Swift.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one way of doing this, but not the only way. One could also use the Swift details as the source of truth, emit source from a small tool and then include those during the compilation step. There is no requirement for a Roslyn source generator here. In fact, doing everything in a small tool ensures that both sides are the same.

I agree this could be done in a Roslyn source generator, but I do not think it is a requirement. Determining the UX needs for the Swift interop will help focus this decision point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are both pros and cons; If it is implemented as a source generator, it is closer to the runtime and is platform-agnostic. On the other hand, if it is integrated into projection tools, it is probably faster but is more platform-specific.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is implemented as a source generator, it is closer to the runtime and is platform-agnostic.

Any small stand alone tool can be pure C# and ship just like a source generator. This is merely about packaging logic, nothing more. There is no requirement that the tooling for this live in the Roslyn process.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source-generator item here would be specifically for generating .NET code based on C# types that implement the Swift APIs. It should only be used for the use case that source-generators are the best case for, generating C# code based on C# code that exists in the same compilation. My thinking is that it would be specifically for generating the type metadata and witness tables. Alternatively, we could use a tool that edits the IL after compilation, but that would almost assuredly break hot-reload.

proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Show resolved Hide resolved
@lemonmojo
Copy link

Very much looking forward to this! 😎

I understand that Swift -> C# interop is not a big focus here but allow me to add my vote for that here as well.
While we found ways to make this work for us (Beyond.NET), first party support would obviously be very much appreciated.

proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Show resolved Hide resolved
jkoritzinsky and others added 2 commits October 23, 2023 13:28
Co-authored-by: Aaron Robinson <arobins@microsoft.com>
proposed/swift-interop.md Outdated Show resolved Hide resolved
proposed/swift-interop.md Outdated Show resolved Hide resolved
jkoritzinsky and others added 2 commits October 23, 2023 13:59
Co-authored-by: Rolf Bjarne Kvinge <rolf@xamarin.com>
…or register based on more feedback and experimentation.
proposed/swift-interop.md Outdated Show resolved Hide resolved
…visually separate out the proposals we're still considering.

Both options characteristics: They both express that the to-be-called Swift function uses the Error Register in the signature and they both require signature manipulation in the JIT/AOT compilers. The first option represents the "error return" as a managed `ref` or `out` parameter. Like with `SwiftSelf`, we would throw an `InvalidProgramException` for a signature with multiple `SwiftError` parameters.

Option 2 provides an alternative to Option 2 as `ref` and `out` parameters are not supported in `UnmanagedCallersOnly` methods. Option 1 likely provides more possible JIT optimizations as managed pointers are more easily reasoned about in some of our JIT/AOT intermediate representations, but supporting it in all of our goal scenarios (P/Invoke, UnmanagedCallersOnly, function pointers) would require some minor language changes to allow the managed pointers in the `UnmanagedCallersOnly` signature for these special types. Option 2 (`SwiftError*`) does not require any language changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the UnmanagedCallersOnly case, the JIT would need to:

  • Introduce a dummy local
  • Pretend that the address of this local is method argument to allow user code to set it
  • Read the value of the local at the end of the method and store it into the error register

I think it is about at least as complicated to implement as the rejected option 2, for RyuJIT at least. The rejected option 2 may be actually simpler - the JIT just needs to take the SwiftReturn value at the end of the method and store it into the error register and the return value register or return buffer.

Copy link
Member

@jakobbotsch jakobbotsch Oct 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late response, I just only noticed the design document today. My response here is mainly about RyuJIT, and I realize that UX is a large factor too.

For RyuJIT, there are definitely many separate complexities around each of these. For UnmanagedCallersOnly, with SwiftReturn:

  1. For return buffers, the JIT very early transforms these to regular stores. We would still need a separate representation internally to represent the return of the error.
  2. For structs returned in registers the JIT can keep struct values in the more natural representation, but this is tied into multi-reg handling and promotion. Some platforms (e.g. win-x64) does not have multi-reg call returns today, so we would need to do the work to support this for the Swift case if we wanted to keep things enregistered.

I think there are similar complexities in the other direction (when calling a function returning SwiftReturn).

For SwiftError* in the signature you run into another set of problems -- for one, it is not so natural that the parameter/argument differs in a level of indirection to the actual thing we are trying to represent. It also promotes a pattern to the user where they end up using the address of this local in ways that will cause inefficiencies in the codegen.

In the UnmanagedCallersOnly case the suggested implementation by @jkotas sounds reasonable to me. It requires us to special case this dummy local in a number of places because its uses do not appear explicitly in the IR, or otherwise to handle it conservatively (no enregistration). One unfortunate thing about both of these designs is that it makes it possible for the user to conflate error returns with normal returns.

Calling something with the SwiftError* is probably the most difficult of all of these cases to represent. If we want to handle this pattern efficiently (without stack spilling) then I do not think this can be represented like this very long within the JIT -- likely we are going to transform something like

void* error;
var value = CallToSwiftFunction(x, y, &error);

into the RyuJIT IR equivalent of

void* error;
var value = CallToSwiftFunction(x, y);
error = SwiftErrorRegister();

immediately during importation. Here SwiftErrorRegister() is not a call, but some internal IR node. Some work will be needed to make sure the CallToSwiftFunction/SwiftErrorRegister pairs do not get separated during JIT optimizations, but I do not think this will be hard. LSRA work will be needed to make sure that the error register is considered busy until it has been read after CallToSwiftFunction.

I might not understand what exactly option 4 entails here, but if it just means having the JIT generate a callout to a VM function when the error register is non-zero after a Swift call, then I think it would be by far the simplest to implement in RyuJIT.

I think option 3 is implementable in a way that does not leave any performance on the table, by considered the operations as intrinsics. I expect that this effectively amounts to directly exposing the mechanisms that RyuJIT will end up with under the hood to the user. I would probably switch "Set" to a direct "return" intrinsic instead, so that we would end up with intrinsics like:

[Intrinsic]
void* SwiftErrorReturn() => throw null;
[Intrinsic]
T ReturnSwiftError<T>(void* error) => throw null;
[Intrinsic]
void ReturnSwiftError(void* error) => throw null;

SwiftErrorReturn() would return the error returned by the last Swift call. in RyuJIT, it would be transformed directly into a use of the "swift error local" that is defined (by the transformation above) after every call to a Swift function.
The user visible behavior of ReturnSwiftError is an immediate return from the calling function with that error in the error register. A benefit of exposing it like this is that it models that no "normal" return value is necessary (though I am not sure whether or not that is actually how it works in Swift -- can you still look at the value when there was an error return?). It also means that we know which returns are error paths and ensures that these paths do not introduce phantom uses of unrelated values that may affect the happy paths. But the UX of this may not be ideal.

If I had to guess, I would say that option 3 with intrinsic handling is going to be the implementation that is most natural in terms of what RyuJIT will end up with (and thus easiest to ensure that RyuJIT can handle efficiently).

EDIT: I realized I put this under a resolved comment, so I unmarked it as resolved to make it a bit more visible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, the shape you mention for the generated code when calling a method is fine with me. Clang uses the same representation.

In any case, we want to use the same representation for calling Swift via P/Invoke and function pointers and UnmanagedCallersOnly.

Our primary focus is on Apple platforms, so the concerns about win-x64 suggests that we should push Swift on Windows support out from v1 if it will have extra costs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, we want to use the same representation for calling Swift via P/Invoke and function pointers and UnmanagedCallersOnly.

I would say that my main concern for the SwiftError* design is around returns in UnmanagedCallersOnly functions. In addition to being harder to implement than an intrinsic it promotes writing code that makes it hard for the JIT to separate normal and error returns. For example, returning an error from a function with a large struct return is likely going to incur cost that the same function implemented in Swift wouldn't incur. I am not sure whether it matters or not.

Our primary focus is on Apple platforms, so the concerns about win-x64 suggests that we should push Swift on Windows support out from v1 if it will have extra costs.

I think with the SwiftError* design there shouldn't be much difference in handling between the platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you still look at the value when there was an error return?

Yes, caller gets a return value and should check for an error immediately after the call.

For example, returning an error from a function with a large struct return is likely going to incur cost that the same function implemented in Swift wouldn't incur.

I believe it depends on how heavily UnmanagedCallersOnly is utilized by projection tools.

Copy link
Member

@kotlarmilos kotlarmilos Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Swift compiler is not clearing out the return buffer if an error is thrown. The caller could still access the return value even though the error register is set.

Good point. How can we verify that? Probably we can try to set the return register before the call and inspect if it has changed after the call.

Copy link
Member

@jkotas jkotas Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLang/LLVM models swift error the same way as the currently proposed model (pretend that the value is passed via addressable memory): https://clang.llvm.org/docs/AttributeReference.html#swift-error-result , with additional twist that it must be the last parameter (shoud we do the same?). It suggests that the proposed model is a decent choice.

We are going to pay interop transition cost on each of the swift calls. Also, larger structs are typically going to be passed using opaque convention that is expensive compared to passing things in registers. It is ok if the current JIT is not able to 100% optimize some cases of error results.

Copy link
Member

@jakobbotsch jakobbotsch Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. How can we verify that? Probably we can try to set the return register before the call and inspect if it has changed after the call.

I verified it in https://godbolt.org/z/Mz58KW9qo. throws_does_not_clear does not write to the return buffer pointed to by rax.

CLang/LLVM models swift error the same way as the currently proposed model (pretend that the value is passed via addressable memory): https://clang.llvm.org/docs/AttributeReference.html#swift-error-result , with additional twist that it must be the last parameter (shoud we do the same?). It suggests that the proposed model is a decent choice.

As far as I can read in https://github.com/apple/swift/blob/main/docs/ABI/CallingConvention.rst, the Swift compiler lowers Swift signatures into simpler signatures before they are presented to LLVM. It means the Swift compiler can just avoid creating any IR that writes to the return buffer pointer when it sees throw expression. I do not think this is an optimization done at the LLVM level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With that said, the design is fine and looks implementable to me, just wanted to point out this difference between throw and using a pseudo local.

I am curious whether we are going to do the same kind of lowering as the Swift compiler appears to be doing for signatures. In particular the frozen struct/tuple lowering looks very complex and not described with a lot of detail in https://github.com/apple/swift/blob/main/docs/ABI/CallingConvention.rst. Will the JIT be expected to handle the lowering or can we rely on seeing simplified signatures for swift calls/UnmanagedCallersOnly?

Copy link
Member

@kotlarmilos kotlarmilos Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLang/LLVM models swift error the same way as the currently proposed model (pretend that the value is passed via addressable memory): https://clang.llvm.org/docs/AttributeReference.html#swift-error-result , with additional twist that it must be the last parameter (shoud we do the same?)

When calling from C# into Swift, if error or self are the last parameters, there is no need to handle additional arguments at the runtime level to match the Swift signature. These arguments will be automatically ignored by the Swift callee function.

proposed/swift-interop.md Outdated Show resolved Hide resolved
@yaakov-h
Copy link
Member

I can see roughly how value types and class types would map over, but what about Swift Extensions?

@jkoritzinsky
Copy link
Member Author

I can see roughly how value types and class types would map over, but what about Swift Extensions?

We're still working on the language projections for features like Extensions and protocols. We'll update this proposal with those designs as we start to work on them.


We have a prototype that uses Rejected Option 1; however, using an attribute has the same limitations as the attribute option for the self register.

Rejected Option 2 provides a more functional-programming-style API that is more intuitive than the `SwiftError` options above, but likely more expensive to implement. As a result, Option 2 would be better as a higher-level option and not the option implemented by the JIT/AOT compilers.
Copy link
Member

@jkotas jkotas Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but likely more expensive to implement.

I do not think that this is measurably more expensive to implement.

Having said that, I am ok with the currently selected SwiftError* design.

@jkoritzinsky jkoritzinsky merged commit b70a81a into dotnet:main Oct 26, 2023
@jkoritzinsky jkoritzinsky deleted the swift branch October 26, 2023 17:50

Like .NET, Swift has both value types and class types. The value types in Swift include both structs and enums. When passed at the ABI layer, they are generally treated as their composite structure and passed by value in registers. However, when Library Evolution mode is enabled, struct layouts are considered "opaque" and their size, alignment, and layout can vary between compile-time and runtime. As a result, all structs and enums in Library Evolution mode are passed by a pointer instead of in registers. Frozen structs and enums (annotated with the `@frozen` attribute) are not considered opaque and will be enregistered. We plan to interoperate with Swift through the Library Evolution mode, so we will generally be able to pass structures using opaque layout.

When calling a function that returns an opaque struct, the Swift ABI always requires the struct to be passed by a return buffer. Since we cannot know the size at compile time, we'll need to handle this case manually in the calling convention signature as well by manually introduce the struct return buffer parameter. Alternatively, we could enlighten the JIT/AOT compilers about how to look up the correct buffer size for an opaque struct. This would likely be quite expensive, so we should avoid this option and just manually handle the return buffer. We can determine the return buffer size by using the value witness table for the struct type. We could always move to the JIT-intrinsic model in the future if we desired.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return buffer in the Swift calling convention is passed in a special register, so we will need some way to establish in the JIT which argument is meant to be a return buffer pointer if it appears as a normal parameter in the signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.